{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "markdown"
    }
   },
   "source": [
    "# Reranking for Enhanced RAG Systems\n",
    "\n",
    "This notebook implements reranking techniques to improve retrieval quality in RAG systems. Reranking acts as a second filtering step after initial retrieval to ensure the most relevant content is used for response generation.\n",
    "\n",
    "## Key Concepts of Reranking\n",
    "\n",
    "1. **Initial Retrieval**: First pass using basic similarity search (less accurate but faster)\n",
    "2. **Document Scoring**: Evaluating each retrieved document's relevance to the query\n",
    "3. **Reordering**: Sorting documents by their relevance scores\n",
    "4. **Selection**: Using only the most relevant documents for response generation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting Up the Environment\n",
    "We begin by importing necessary libraries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import fitz\n",
    "import os\n",
    "import numpy as np\n",
    "import json\n",
    "from openai import OpenAI\n",
    "import re"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Extracting Text from a PDF File\n",
    "To implement RAG, we first need a source of textual data. In this case, we extract text from a PDF file using the PyMuPDF library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def extract_text_from_pdf(pdf_path):\n",
    "    \"\"\"\n",
    "    Extracts the text of every page in a PDF file and returns it as one string.\n",
    "\n",
    "    Args:\n",
    "    pdf_path (str): Path to the PDF file.\n",
    "\n",
    "    Returns:\n",
    "    str: Extracted text from the PDF.\n",
    "    \"\"\"\n",
    "    # Open the PDF file\n",
    "    mypdf = fitz.open(pdf_path)\n",
    "    all_text = \"\"  # Initialize an empty string to store the extracted text\n",
    "\n",
    "    # Iterate through each page in the PDF\n",
    "    for page_num in range(mypdf.page_count):\n",
    "        page = mypdf[page_num]  # Get the page\n",
    "        text = page.get_text(\"text\")  # Extract text from the page\n",
    "        all_text += text  # Append the extracted text to the all_text string\n",
    "\n",
    "    return all_text  # Return the extracted text"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Chunking the Extracted Text\n",
    "Once we have the extracted text, we divide it into smaller, overlapping chunks to improve retrieval accuracy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "def chunk_text(text, n, overlap):\n",
    "    \"\"\"\n",
    "    Chunks the given text into segments of n characters with overlap.\n",
    "\n",
    "    Args:\n",
    "    text (str): The text to be chunked.\n",
    "    n (int): The number of characters in each chunk.\n",
    "    overlap (int): The number of overlapping characters between chunks; must be smaller than n.\n",
    "\n",
    "    Returns:\n",
    "    List[str]: A list of text chunks.\n",
    "    \"\"\"\n",
    "    # A step of zero makes range() raise, and a negative step yields no chunks\n",
    "    if overlap >= n:\n",
    "        raise ValueError(\"overlap must be smaller than n\")\n",
    "\n",
    "    chunks = []  # Initialize an empty list to store the chunks\n",
    "    \n",
    "    # Loop through the text with a step size of (n - overlap)\n",
    "    for i in range(0, len(text), n - overlap):\n",
    "        # Append a chunk of text from index i to i + n to the chunks list\n",
    "        chunks.append(text[i:i + n])\n",
    "\n",
    "    return chunks  # Return the list of text chunks"
   ]
  },
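  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the overlap concrete, here is a small illustrative call to `chunk_text` on a toy string (the sample string and sizes are made up for this demonstration and are not taken from the PDF). With `n=4` and `overlap=2` the window advances two characters at a time, so neighbouring chunks share two characters and the final chunk can be shorter than `n`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Toy input chosen only for illustration\n",
    "sample_text = \"abcdefghij\"\n",
    "\n",
    "# Windows start at indices 0, 2, 4, 6, 8 (step = n - overlap = 2)\n",
    "for chunk in chunk_text(sample_text, 4, 2):\n",
    "    print(repr(chunk))\n",
    "\n",
    "# Prints 'abcd', 'cdef', 'efgh', 'ghij', 'ij' (the last chunk is a short tail)"
   ]
  },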
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting Up the OpenAI API Client\n",
    "We initialize the OpenAI client, pointed at a local Ollama server's OpenAI-compatible endpoint, to generate embeddings and responses."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initialize the OpenAI client with the base URL and API key\n",
    "client = OpenAI(\n",
    "    base_url=\"http://localhost:11434/v1/\",\n",
    "    api_key=\"ollama\"  # Ollama doesn't require a real API key, but the client needs a value\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Building a Simple Vector Store\n",
    "To demonstrate how reranking integrates with retrieval, let's implement a simple vector store."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "class SimpleVectorStore:\n",
    "    \"\"\"\n",
    "    A simple vector store implementation using NumPy.\n",
    "    \"\"\"\n",
    "    def __init__(self):\n",
    "        \"\"\"\n",
    "        Initialize the vector store.\n",
    "        \"\"\"\n",
    "        self.vectors = []  # List to store embedding vectors\n",
    "        self.texts = []  # List to store original texts\n",
    "        self.metadata = []  # List to store metadata for each text\n",
    "    \n",
    "    def add_item(self, text, embedding, metadata=None):\n",
    "        \"\"\"\n",
    "        Add an item to the vector store.\n",
    "\n",
    "        Args:\n",
    "        text (str): The original text.\n",
    "        embedding (List[float]): The embedding vector.\n",
    "        metadata (dict, optional): Additional metadata.\n",
    "        \"\"\"\n",
    "        self.vectors.append(np.array(embedding))  # Convert embedding to numpy array and add to vectors list\n",
    "        self.texts.append(text)  # Add the original text to texts list\n",
    "        self.metadata.append(metadata or {})  # Add metadata to metadata list, use empty dict if None\n",
    "    \n",
    "    def similarity_search(self, query_embedding, k=5):\n",
    "        \"\"\"\n",
    "        Find the most similar items to a query embedding.\n",
    "\n",
    "        Args:\n",
    "        query_embedding (List[float]): Query embedding vector.\n",
    "        k (int): Number of results to return.\n",
    "\n",
    "        Returns:\n",
    "        List[Dict]: Top k most similar items with their texts and metadata.\n",
    "        \"\"\"\n",
    "        if not self.vectors:\n",
    "            return []  # Return empty list if no vectors are stored\n",
    "        \n",
    "        # Convert query embedding to numpy array\n",
    "        query_vector = np.array(query_embedding)\n",
    "        \n",
    "        # Calculate similarities using cosine similarity\n",
    "        similarities = []\n",
    "        for i, vector in enumerate(self.vectors):\n",
    "            # Compute cosine similarity between query vector and stored vector\n",
    "            similarity = np.dot(query_vector, vector) / (np.linalg.norm(query_vector) * np.linalg.norm(vector))\n",
    "            similarities.append((i, similarity))  # Append index and similarity score\n",
    "        \n",
    "        # Sort by similarity (descending)\n",
    "        similarities.sort(key=lambda x: x[1], reverse=True)\n",
    "        \n",
    "        # Return top k results\n",
    "        results = []\n",
    "        for i in range(min(k, len(similarities))):\n",
    "            idx, score = similarities[i]\n",
    "            results.append({\n",
    "                \"text\": self.texts[idx],  # Add the corresponding text\n",
    "                \"metadata\": self.metadata[idx],  # Add the corresponding metadata\n",
    "                \"similarity\": score  # Add the similarity score\n",
    "            })\n",
    "        \n",
    "        return results  # Return the list of top k similar items"
   ]
  },
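  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check, the store can be exercised with hand-written 2-D vectors instead of real embeddings. This is only a sketch: the texts and vectors below are hypothetical and chosen so the cosine-similarity ranking is easy to verify by eye."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical toy data: each text gets a hand-picked 2-D \"embedding\"\n",
    "toy_store = SimpleVectorStore()\n",
    "toy_store.add_item(\"about cats\", [1.0, 0.0])\n",
    "toy_store.add_item(\"about dogs\", [0.0, 1.0])\n",
    "toy_store.add_item(\"cats and dogs\", [1.0, 1.0])\n",
    "\n",
    "# The query vector points almost along the first axis, so \"about cats\"\n",
    "# ranks first and \"cats and dogs\" second; \"about dogs\" is cut off by k=2\n",
    "for hit in toy_store.similarity_search([1.0, 0.1], k=2):\n",
    "    print(hit[\"text\"], round(float(hit[\"similarity\"]), 3))"
   ]
  },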
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating Embeddings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_embeddings(text, model=\"bge-m3:latest\"):\n",
    "    \"\"\"\n",
    "    Creates embeddings for the given text using the specified embedding model.\n",
    "\n",
    "    Args:\n",
    "    text (str or List[str]): The input text, or list of texts, to embed.\n",
    "    model (str): The model to be used for creating embeddings.\n",
    "\n",
    "    Returns:\n",
    "    List[float] or List[List[float]]: One embedding vector for a string input, or a list of vectors for a list input.\n",
    "    \"\"\"\n",
    "    # Handle both string and list inputs by converting string input to a list\n",
    "    input_text = text if isinstance(text, list) else [text]\n",
    "    \n",
    "    # Create embeddings for the input text using the specified model\n",
    "    response = client.embeddings.create(\n",
    "        model=model,\n",
    "        input=input_text\n",
    "    )\n",
    "    \n",
    "    # If input was a string, return just the first embedding\n",
    "    if isinstance(text, str):\n",
    "        return response.data[0].embedding\n",
    "    \n",
    "    # Otherwise, return all embeddings as a list of vectors\n",
    "    return [item.embedding for item in response.data]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Document Processing Pipeline\n",
    "Now that we have defined the necessary functions and classes, we can proceed to define the document processing pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "def process_document(pdf_path, chunk_size=1000, chunk_overlap=200):\n",
    "    \"\"\"\n",
    "    Process a document for RAG.\n",
    "\n",
    "    Args:\n",
    "    pdf_path (str): Path to the PDF file.\n",
    "    chunk_size (int): Size of each chunk in characters.\n",
    "    chunk_overlap (int): Overlap between chunks in characters.\n",
    "\n",
    "    Returns:\n",
    "    SimpleVectorStore: A vector store containing document chunks and their embeddings.\n",
    "    \"\"\"\n",
    "    # Extract text from the PDF file\n",
    "    print(\"Extracting text from PDF...\")\n",
    "    extracted_text = extract_text_from_pdf(pdf_path)\n",
    "    \n",
    "    # Chunk the extracted text\n",
    "    print(\"Chunking text...\")\n",
    "    chunks = chunk_text(extracted_text, chunk_size, chunk_overlap)\n",
    "    print(f\"Created {len(chunks)} text chunks\")\n",
    "    \n",
    "    # Create embeddings for the text chunks\n",
    "    print(\"Creating embeddings for chunks...\")\n",
    "    chunk_embeddings = create_embeddings(chunks)\n",
    "    \n",
    "    # Initialize a simple vector store\n",
    "    store = SimpleVectorStore()\n",
    "    \n",
    "    # Add each chunk and its embedding to the vector store\n",
    "    for i, (chunk, embedding) in enumerate(zip(chunks, chunk_embeddings)):\n",
    "        store.add_item(\n",
    "            text=chunk,\n",
    "            embedding=embedding,\n",
    "            metadata={\"index\": i, \"source\": pdf_path}\n",
    "        )\n",
    "    \n",
    "    print(f\"Added {len(chunks)} chunks to the vector store\")\n",
    "    return store"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Implementing LLM-based Reranking\n",
    "Let's implement the LLM-based reranking function using the OpenAI API."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rerank_with_llm(query, results, top_n=3, model=\"qwen2.5:7b\"):\n",
    "    \"\"\"\n",
    "    Reranks search results using LLM relevance scoring.\n",
    "    \n",
    "    Args:\n",
    "        query (str): User query\n",
    "        results (List[Dict]): Initial search results\n",
    "        top_n (int): Number of results to return after reranking\n",
    "        model (str): Model to use for scoring\n",
    "        \n",
    "    Returns:\n",
    "        List[Dict]: Reranked results\n",
    "    \"\"\"\n",
    "    print(f\"Reranking {len(results)} documents...\")  # Print the number of documents to be reranked\n",
    "    \n",
    "    scored_results = []  # Initialize an empty list to store scored results\n",
    "    \n",
    "    # Define the system prompt for the LLM\n",
    "    system_prompt = \"\"\"You are an expert at evaluating document relevance for search queries.\n",
    "Your task is to rate documents on a scale from 0 to 10 based on how well they answer the given query.\n",
    "\n",
    "Guidelines:\n",
    "- Score 0-2: Document is completely irrelevant\n",
    "- Score 3-5: Document has some relevant information but doesn't directly answer the query\n",
    "- Score 6-8: Document is relevant and partially answers the query\n",
    "- Score 9-10: Document is highly relevant and directly answers the query\n",
    "\n",
    "You MUST respond with ONLY a single integer score between 0 and 10. Do not include ANY other text.\"\"\"\n",
    "    \n",
    "    # Iterate through each result\n",
    "    for i, result in enumerate(results):\n",
    "        # Show progress every 5 documents\n",
    "        if i % 5 == 0:\n",
    "            print(f\"Scoring document {i+1}/{len(results)}...\")\n",
    "        \n",
    "        # Define the user prompt for the LLM\n",
    "        user_prompt = f\"\"\"Query: {query}\n",
    "\n",
    "Document:\n",
    "{result['text']}\n",
    "\n",
    "Rate this document's relevance to the query on a scale from 0 to 10:\"\"\"\n",
    "        \n",
    "        # Get the LLM response\n",
    "        response = client.chat.completions.create(\n",
    "            model=model,\n",
    "            temperature=0,\n",
    "            messages=[\n",
    "                {\"role\": \"system\", \"content\": system_prompt},\n",
    "                {\"role\": \"user\", \"content\": user_prompt}\n",
    "            ]\n",
    "        )\n",
    "        \n",
    "        # Extract the score from the LLM response\n",
    "        score_text = response.choices[0].message.content.strip()\n",
    "        \n",
    "        # Use regex to extract the numerical score\n",
    "        score_match = re.search(r'\\b(10|[0-9])\\b', score_text)\n",
    "        if score_match:\n",
    "            score = float(score_match.group(1))\n",
    "        else:\n",
    "            # If score extraction fails, use similarity score as fallback\n",
    "            print(f\"Warning: Could not extract score from response: '{score_text}', using similarity score instead\")\n",
    "            score = result[\"similarity\"] * 10\n",
    "        \n",
    "        # Append the scored result to the list\n",
    "        scored_results.append({\n",
    "            \"text\": result[\"text\"],\n",
    "            \"metadata\": result[\"metadata\"],\n",
    "            \"similarity\": result[\"similarity\"],\n",
    "            \"relevance_score\": score\n",
    "        })\n",
    "    \n",
    "    # Sort results by relevance score in descending order\n",
    "    reranked_results = sorted(scored_results, key=lambda x: x[\"relevance_score\"], reverse=True)\n",
    "    \n",
    "    # Return the top_n results\n",
    "    return reranked_results[:top_n]"
   ]
  },
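  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The score parsing in `rerank_with_llm` depends on a single regular expression, so it is worth seeing how that pattern behaves on replies that do not follow the \"integer only\" instruction. The replies below are hypothetical examples, not actual model output. Note that the first standalone number in range wins, and that out-of-range numbers or free text produce no match, which is the case where the function falls back to the similarity score."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "# Hypothetical LLM replies used only to illustrate the parsing behaviour\n",
    "sample_replies = [\"8\", \"Score: 7\", \"I would rate this 9/10\", \"15\", \"not sure\"]\n",
    "\n",
    "for reply in sample_replies:\n",
    "    # Same pattern as in rerank_with_llm: a standalone integer from 0 to 10\n",
    "    match = re.search(r'\\b(10|[0-9])\\b', reply)\n",
    "    print(repr(reply), \"->\", match.group(1) if match else None)\n",
    "\n",
    "# '8' -> 8, 'Score: 7' -> 7, 'I would rate this 9/10' -> 9,\n",
    "# '15' -> None, 'not sure' -> None"
   ]
  },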
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Simple Keyword-based Reranking"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rerank_with_keywords(query, results, top_n=3):\n",
    "    \"\"\"\n",
    "    A simple alternative reranking method based on keyword matching and position.\n",
    "    \n",
    "    Args:\n",
    "        query (str): User query\n",
    "        results (List[Dict]): Initial search results\n",
    "        top_n (int): Number of results to return after reranking\n",
    "        \n",
    "    Returns:\n",
    "        List[Dict]: Reranked results\n",
    "    \"\"\"\n",
    "    # Extract keywords (words longer than 3 characters), dropping punctuation so \"work?\" matches \"work\"\n",
    "    keywords = [word for word in re.findall(r\"\\w+\", query.lower()) if len(word) > 3]\n",
    "    \n",
    "    scored_results = []  # Initialize a list to store scored results\n",
    "    \n",
    "    for result in results:\n",
    "        document_text = result[\"text\"].lower()  # Convert document text to lowercase\n",
    "        \n",
    "        # Base score starts with vector similarity\n",
    "        base_score = result[\"similarity\"] * 0.5\n",
    "        \n",
    "        # Initialize keyword score\n",
    "        keyword_score = 0\n",
    "        for keyword in keywords:\n",
    "            if keyword in document_text:\n",
    "                # Add points for each keyword found\n",
    "                keyword_score += 0.1\n",
    "                \n",
    "                # Add more points if keyword appears near the beginning\n",
    "                first_position = document_text.find(keyword)\n",
    "                if first_position < len(document_text) / 4:  # In the first quarter of the text\n",
    "                    keyword_score += 0.1\n",
    "                \n",
    "                # Add points for keyword frequency\n",
    "                frequency = document_text.count(keyword)\n",
    "                keyword_score += min(0.05 * frequency, 0.2)  # Cap at 0.2\n",
    "        \n",
    "        # Calculate the final score by combining base score and keyword score\n",
    "        final_score = base_score + keyword_score\n",
    "        \n",
    "        # Append the scored result to the list\n",
    "        scored_results.append({\n",
    "            \"text\": result[\"text\"],\n",
    "            \"metadata\": result[\"metadata\"],\n",
    "            \"similarity\": result[\"similarity\"],\n",
    "            \"relevance_score\": final_score\n",
    "        })\n",
    "    \n",
    "    # Sort results by final relevance score in descending order\n",
    "    reranked_results = sorted(scored_results, key=lambda x: x[\"relevance_score\"], reverse=True)\n",
    "    \n",
    "    # Return the top_n results\n",
    "    return reranked_results[:top_n]"
   ]
  },
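  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A small, hypothetical example shows how the keyword bonus can override the initial vector ranking. The two result dictionaries below are invented for illustration and only mimic the shape returned by `similarity_search`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical retrieval results: the off-topic chunk has the higher similarity\n",
    "toy_results = [\n",
    "    {\"text\": \"Cooking pasta requires boiling water.\", \"metadata\": {}, \"similarity\": 0.6},\n",
    "    {\"text\": \"Neural networks learn by training on data.\", \"metadata\": {}, \"similarity\": 0.5},\n",
    "]\n",
    "\n",
    "# All three query keywords occur in the second chunk, so its keyword bonus\n",
    "# outweighs the similarity gap and it is listed first after reranking\n",
    "for item in rerank_with_keywords(\"neural networks training\", toy_results, top_n=2):\n",
    "    print(round(item[\"relevance_score\"], 2), item[\"text\"])"
   ]
  },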
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Response Generation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "def generate_response(query, context, model=\"qwen2.5:7b\"):\n",
    "    \"\"\"\n",
    "    Generates a response based on the query and context.\n",
    "    \n",
    "    Args:\n",
    "        query (str): User query\n",
    "        context (str): Retrieved context\n",
    "        model (str): Model to use for response generation\n",
    "        \n",
    "    Returns:\n",
    "        str: Generated response\n",
    "    \"\"\"\n",
    "    # Define the system prompt to guide the AI's behavior\n",
    "    system_prompt = \"You are a helpful AI assistant. Answer the user's question based only on the provided context. If you cannot find the answer in the context, state that you don't have enough information.\"\n",
    "    \n",
    "    # Create the user prompt by combining the context and query\n",
    "    user_prompt = f\"\"\"\n",
    "        Context:\n",
    "        {context}\n",
    "\n",
    "        Question: {query}\n",
    "\n",
    "        Please provide a comprehensive answer based only on the context above.\n",
    "    \"\"\"\n",
    "    \n",
    "    # Generate the response using the specified model\n",
    "    response = client.chat.completions.create(\n",
    "        model=model,\n",
    "        temperature=0,\n",
    "        messages=[\n",
    "            {\"role\": \"system\", \"content\": system_prompt},\n",
    "            {\"role\": \"user\", \"content\": user_prompt}\n",
    "        ]\n",
    "    )\n",
    "    \n",
    "    # Return the generated response content\n",
    "    return response.choices[0].message.content"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Full RAG Pipeline with Reranking\n",
    "So far, we have implemented the core components of the RAG pipeline, including document processing, question answering, and reranking. Now, we will combine these components to create a full RAG pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rag_with_reranking(query, vector_store, reranking_method=\"llm\", top_n=3, model=\"qwen2.5:7b\"):\n",
    "    \"\"\"\n",
    "    Complete RAG pipeline incorporating reranking.\n",
    "    \n",
    "    Args:\n",
    "        query (str): User query\n",
    "        vector_store (SimpleVectorStore): Vector store\n",
    "        reranking_method (str): Method for reranking ('llm' or 'keywords'); any other value skips reranking\n",
    "        top_n (int): Number of results to return after reranking\n",
    "        model (str): Model for response generation\n",
    "        \n",
    "    Returns:\n",
    "        Dict: Results including query, context, and response\n",
    "    \"\"\"\n",
    "    # Create query embedding\n",
    "    query_embedding = create_embeddings(query)\n",
    "    \n",
    "    # Initial retrieval (get more than we need for reranking)\n",
    "    initial_results = vector_store.similarity_search(query_embedding, k=10)\n",
    "    \n",
    "    # Apply reranking\n",
    "    if reranking_method == \"llm\":\n",
    "        reranked_results = rerank_with_llm(query, initial_results, top_n=top_n)\n",
    "    elif reranking_method == \"keywords\":\n",
    "        reranked_results = rerank_with_keywords(query, initial_results, top_n=top_n)\n",
    "    else:\n",
    "        # No reranking, just use top results from initial retrieval\n",
    "        reranked_results = initial_results[:top_n]\n",
    "    \n",
    "    # Combine context from reranked results\n",
    "    context = \"\\n\\n===\\n\\n\".join([result[\"text\"] for result in reranked_results])\n",
    "    \n",
    "    # Generate response based on context\n",
    "    response = generate_response(query, context, model)\n",
    "    \n",
    "    return {\n",
    "        \"query\": query,\n",
    "        \"reranking_method\": reranking_method,\n",
    "        \"initial_results\": initial_results[:top_n],\n",
    "        \"reranked_results\": reranked_results,\n",
    "        \"context\": context,\n",
    "        \"response\": response\n",
    "    }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluating Reranking Quality"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the validation data from a JSON file\n",
    "with open('data/val.json') as f:\n",
    "    data = json.load(f)\n",
    "\n",
    "# Extract the first query from the validation data\n",
    "query = data[0]['question']\n",
    "\n",
    "# Extract the reference answer from the validation data\n",
    "reference_answer = data[0]['ideal_answer']\n",
    "\n",
    "# pdf_path\n",
    "pdf_path = \"data/AI_Information.pdf\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting text from PDF...\n",
      "Chunking text...\n",
      "Created 42 text chunks\n",
      "Creating embeddings for chunks...\n",
      "Added 42 chunks to the vector store\n",
      "Comparing retrieval methods...\n",
      "\n",
      "=== STANDARD RETRIEVAL ===\n",
      "\n",
      "Query: Does AI have the potential to transform the way we live and work?\n",
      "\n",
      "Response:\n",
      "Yes, AI has significant potential to transform both our lives and work in various ways as detailed in the provided context. In terms of work, AI is already transforming business operations across different industries by increasing efficiency, reducing costs, and improving decision-making through automation and data analysis. It also supports human-AI collaboration, where AI tools augment human capabilities, automate mundane tasks, and provide insights that aid in decision-making processes.\n",
      "\n",
      "Moreover, the context suggests that AI will create new job roles such as AI development, data science, AI ethics, and AI training, which require specialized skills and expertise. This indicates a shift in the workforce landscape, necessitating reskilling and upskilling initiatives to adapt to these changes.\n",
      "\n",
      "In terms of daily life, while not explicitly detailed in the provided context, it can be inferred that advancements like AI-generated art and music, as well as AI's role in customer relationship management (CRM) systems, suggest a transformation in how we interact with technology and services. The potential for AI to enhance personal experiences through personalized interactions and predictive analytics further points towards a significant impact on daily life.\n",
      "\n",
      "Overall, the context indicates that AI is poised to play a crucial role in reshaping both professional environments and everyday living, highlighting its transformative capabilities across multiple sectors.\n",
      "\n",
      "=== LLM-BASED RERANKING ===\n",
      "Reranking 10 documents...\n",
      "Scoring document 1/10...\n",
      "Scoring document 6/10...\n",
      "\n",
      "Query: Does AI have the potential to transform the way we live and work?\n",
      "\n",
      "Response:\n",
      "Yes, AI has significant potential to transform both our lives and work in various ways as detailed in the provided context. In terms of work, AI is already transforming business operations across different industries by increasing efficiency, reducing costs, and improving decision-making through automation and data analysis. It also supports human-AI collaboration, where AI tools augment human capabilities, automate mundane tasks, and provide insights that aid in decision-making processes.\n",
      "\n",
      "Moreover, the context suggests that AI will create new job roles such as AI development, data science, AI ethics, and AI training, which require specialized skills and expertise. This indicates a shift in the workforce landscape, necessitating reskilling and upskilling initiatives to adapt to these changes.\n",
      "\n",
      "In terms of daily life, while not explicitly detailed in the provided context, it can be inferred that advancements like AI-generated art and music, as well as AI's role in customer relationship management (CRM) systems, suggest a transformation in how we interact with technology and services. The potential for AI to enhance personal experiences through personalized interactions and predictive analytics further points towards a significant impact on daily life.\n",
      "\n",
      "Overall, the context indicates that AI is poised to play a crucial role in reshaping both professional environments and everyday living, highlighting its transformative capabilities across multiple sectors.\n",
      "\n",
      "=== KEYWORD-BASED RERANKING ===\n",
      "\n",
      "Query: Does AI have the potential to transform the way we live and work?\n",
      "\n",
      "Response:\n",
      "Yes, AI has significant potential to transform both our lives and work. The integration of AI with robotics allows for more complex task performance, adaptability in changing environments, and improved human interaction. In business and industry, AI is enhancing operations through increased efficiency and cost reduction by automating tasks and providing data-driven insights.\n",
      "\n",
      "In the workplace, AI is already transforming roles and creating new opportunities. While there are concerns about job displacement due to automation of repetitive or routine tasks, reskilling and upskilling initiatives can help workers adapt. The future likely involves more collaboration between humans and AI systems, with AI augmenting human capabilities rather than completely replacing them.\n",
      "\n",
      "AI also impacts our daily lives through its applications in various sectors such as finance (e.g., algorithmic trading), customer service (e.g., chatbots), healthcare (e.g., robotic assistance), manufacturing, logistics, and more. These technologies enable personalized experiences, improved decision-making, and enhanced operational efficiency across different industries.\n",
      "\n",
      "Overall, the context suggests that AI is poised to significantly reshape both our professional lives and personal experiences, offering new opportunities while also presenting challenges that need to be managed responsibly.\n"
     ]
    }
   ],
   "source": [
    "# Process document\n",
    "vector_store = process_document(pdf_path)\n",
    "\n",
    "# Example query (replaces the query loaded from the validation data above)\n",
    "query = \"Does AI have the potential to transform the way we live and work?\"\n",
    "\n",
    "# Compare different methods\n",
    "print(\"Comparing retrieval methods...\")\n",
    "\n",
    "# 1. Standard retrieval (no reranking)\n",
    "print(\"\\n=== STANDARD RETRIEVAL ===\")\n",
    "standard_results = rag_with_reranking(query, vector_store, reranking_method=\"none\")\n",
    "print(f\"\\nQuery: {query}\")\n",
    "print(f\"\\nResponse:\\n{standard_results['response']}\")\n",
    "\n",
    "# 2. LLM-based reranking\n",
    "print(\"\\n=== LLM-BASED RERANKING ===\")\n",
    "llm_results = rag_with_reranking(query, vector_store, reranking_method=\"llm\")\n",
    "print(f\"\\nQuery: {query}\")\n",
    "print(f\"\\nResponse:\\n{llm_results['response']}\")\n",
    "\n",
    "# 3. Keyword-based reranking\n",
    "print(\"\\n=== KEYWORD-BASED RERANKING ===\")\n",
    "keyword_results = rag_with_reranking(query, vector_store, reranking_method=\"keywords\")\n",
    "print(f\"\\nQuery: {query}\")\n",
    "print(f\"\\nResponse:\\n{keyword_results['response']}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "def evaluate_reranking(query, standard_results, reranked_results, reference_answer=None):\n",
    "    \"\"\"\n",
    "    Evaluates the quality of reranked results compared to standard results.\n",
    "    \n",
    "    Args:\n",
    "        query (str): User query\n",
    "        standard_results (Dict): Results from standard retrieval\n",
    "        reranked_results (Dict): Results from reranked retrieval\n",
    "        reference_answer (str, optional): Reference answer for comparison\n",
    "        \n",
    "    Returns:\n",
    "        str: Evaluation output\n",
    "    \"\"\"\n",
    "    # Define the system prompt for the AI evaluator\n",
    "    system_prompt = \"\"\"You are an expert evaluator of RAG systems.\n",
    "    Compare the retrieved contexts and responses from two different retrieval methods.\n",
    "    Assess which one provides better context and a more accurate, comprehensive answer.\"\"\"\n",
    "    \n",
    "    # Prepare the comparison text with truncated contexts and responses\n",
    "    comparison_text = f\"\"\"Query: {query}\n",
    "\n",
    "Standard Retrieval Context:\n",
    "{standard_results['context'][:1000]}... [truncated]\n",
    "\n",
    "Standard Retrieval Answer:\n",
    "{standard_results['response']}\n",
    "\n",
    "Reranked Retrieval Context:\n",
    "{reranked_results['context'][:1000]}... [truncated]\n",
    "\n",
    "Reranked Retrieval Answer:\n",
    "{reranked_results['response']}\"\"\"\n",
    "\n",
    "    # If a reference answer is provided, include it in the comparison text\n",
    "    if reference_answer:\n",
    "        comparison_text += f\"\"\"\n",
    "        \n",
    "Reference Answer:\n",
    "{reference_answer}\"\"\"\n",
    "\n",
    "    # Create the user prompt for the AI evaluator\n",
    "    user_prompt = f\"\"\"\n",
    "{comparison_text}\n",
    "\n",
    "Please evaluate which retrieval method provided:\n",
    "1. More relevant context\n",
    "2. More accurate answer\n",
    "3. More comprehensive answer\n",
    "4. Better overall performance\n",
    "\n",
    "Provide a detailed analysis with specific examples.\n",
    "\"\"\"\n",
    "    \n",
    "    # Generate the evaluation response using the specified model\n",
    "    response = client.chat.completions.create(\n",
    "        model=\"qwen2.5:7b\",\n",
    "        temperature=0,\n",
    "        messages=[\n",
    "            {\"role\": \"system\", \"content\": system_prompt},\n",
    "            {\"role\": \"user\", \"content\": user_prompt}\n",
    "        ]\n",
    "    )\n",
    "    \n",
    "    # Return the evaluation output\n",
    "    return response.choices[0].message.content"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "=== EVALUATION RESULTS ===\n",
      "### Evaluation of Retrieval Methods\n",
      "\n",
      "#### 1. Relevance to Context and Query\n",
      "- **Standard Retrieval Method:**\n",
      "  - The provided context is highly relevant as it covers various aspects of AI's impact on both work and daily life, including business operations, job displacement, reskilling initiatives, human-AI collaboration, and the creation of new job roles.\n",
      "  - It directly addresses the query by discussing how AI can transform our lives and work.\n",
      "\n",
      "- **Reranked Retrieval Method:**\n",
      "  - The context is also relevant but seems to be a rephrasing or rearrangement of the same information. There are no additional insights or details that significantly enhance relevance compared to the standard method.\n",
      "  - It covers similar topics such as AI in business operations, job displacement, reskilling initiatives, and human-AI collaboration.\n",
      "\n",
      "#### 2. Accuracy\n",
      "- **Standard Retrieval Method:**\n",
      "  - The answer is accurate and comprehensive. It correctly identifies how AI can transform work through automation, data analysis, and decision-making support.\n",
      "  - It also addresses the potential for new job roles and the need for reskilling initiatives, which are relevant to the query.\n",
      "\n",
      "- **Reranked Retrieval Method:**\n",
      "  - The answer is accurate but lacks some of the specific details provided in the standard method. For example, it does not mention AI's role in customer service or financial processes as explicitly.\n",
      "  - It also omits the discussion on how AI can enhance personal experiences and daily life.\n",
      "\n",
      "#### 3. Comprehensiveness\n",
      "- **Standard Retrieval Method:**\n",
      "  - The answer is more comprehensive because it covers a broader range of topics related to AI's impact, including business operations, job displacement, reskilling initiatives, human-AI collaboration, and the creation of new job roles.\n",
      "  - It also provides an inference about how AI can transform daily life through personalized interactions and predictive analytics.\n",
      "\n",
      "- **Reranked Retrieval Method:**\n",
      "  - The answer is less comprehensive as it does not cover all the topics mentioned in the standard method. For instance, it does not discuss AI's role in financial processes or customer service.\n",
      "  - It also lacks the inference about how AI can enhance personal experiences and daily life.\n",
      "\n",
      "#### 4. Overall Performance\n",
      "- **Standard Retrieval Method:**\n",
      "  - The overall performance is better because it provides a more detailed and comprehensive answer that directly addresses the query comprehensively.\n",
      "  - It offers a broader perspective on both professional environments and everyday living, making it more informative and useful for understanding AI's potential impact.\n",
      "\n",
      "- **Reranked Retrieval Method:**\n",
      "  - While still relevant and accurate, the overall performance is slightly lower due to its lack of additional details and comprehensive coverage. The rephrasing does not add significant value or new insights.\n",
      "\n",
      "### Detailed Analysis\n",
      "\n",
      "1. **Relevance to Context and Query:**\n",
      "   - Both methods are highly relevant but the standard method provides a more direct and complete response.\n",
      "   \n",
      "2. **Accuracy:**\n",
      "   - Both answers are accurate, but the standard retrieval method is slightly more detailed in its coverage of AI's impact.\n",
      "\n",
      "3. **Comprehensiveness:**\n",
      "   - The standard retrieval method outperforms the reranked version by covering a wider range of topics related to AI’s potential impacts.\n",
      "   \n",
      "4. **Overall Performance:**\n",
      "   - The standard retrieval method provides a better overall performance due to its comprehensive and detailed nature, making it more useful for understanding the full scope of AI's transformative capabilities.\n",
      "\n",
      "### Conclusion\n",
      "The standard retrieval method is superior in providing relevant context, accuracy, comprehensiveness, and overall performance compared to the reranked version. It offers a broader perspective on how AI can transform both professional environments and everyday life, making it a better choice for answering the query effectively.\n"
     ]
    }
   ],
   "source": [
    "# Evaluate the quality of reranked results compared to standard results\n",
    "evaluation = evaluate_reranking(\n",
    "    query=query,  # The user query\n",
    "    standard_results=standard_results,  # Results from standard retrieval\n",
    "    reranked_results=llm_results,  # Results from LLM-based reranking\n",
    "    reference_answer=reference_answer  # Reference answer for comparison\n",
    ")\n",
    "\n",
    "# Print the evaluation results\n",
    "print(\"\\n=== EVALUATION RESULTS ===\")\n",
    "print(evaluation)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "rag",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
